Abstract

We suggest an information-theoretic approach for measuring stylistic coordination in dialogues.
The proposed measure has a simple predictive interpretation and can account for various
confounding factors through proper conditioning. We revisit some of the previous studies that
reported strong signatures of stylistic accommodation, and find that a significant part of the
observed coordination can be attributed to a simple confounding effect: length coordination.
Specifically, longer utterances tend to be followed by longer responses, which gives rise to
spurious correlations in the other stylistic features. We propose a test to distinguish
correlations in length due to contextual factors (topic of conversation, user verbosity, etc.)
from turn-by-turn coordination. We also suggest a test to identify whether stylistic
coordination persists even after accounting for length coordination and contextual factors.


Introduction 

Communication Accommodation Theory [1] states that people tend to adapt their communication
style (voice, gestures, word choice, etc.) in response to the person with whom they interact.
Originally, experiments on linguistic accommodation were confined to small-scale laboratory
settings with a handful of participants. The recent proliferation of digital (or digitized)
communication data offers an opportunity to study nuances of human communication behavior on
much larger scales. A number of recent studies have indicated the presence of stylistic
coordination in communication [2-5], where one person’s use of a linguistic feature (e.g.
prepositions) increases the probability that a response will include the same feature.
Linguistic style coordination (or matching) has been used to predict relationship stability [2]
and negotiation outcomes [6], understand group cohesiveness [3], and infer relative social
status and power relationships among individuals [5].

Most reports of linguistic style coordination have been based on correlational analysis.
Thus, such claims are susceptible to various confounding effects. For instance, it is known that
there is significant length coordination in dialogues, in the sense that a longer utterance from
user Y tends to elicit a longer response from user X [7]. Thus, if the probability of an utterance
containing a feature, e.g. prepositions or words whose second letter is “r”, depends only on
length, this will create the illusion of stylistic coordination on the given feature.

Here we attempt to remedy the problem and propose an information-theoretic framework
for characterizing stylistic coordination in dialogues. Namely, given a temporally ordered
sequence of utterances (verbal or electronic statements depending on the context) by two
individuals, we characterize their stylistic coordination with time-shifted mutual information.
The proposed coordination measure characterizes the dependence between the stylistic features
of the original post and the response. In addition, we provide a computational framework to
account for confounders when measuring stylistic coordination.

We revisit some of the case studies where linguistic coordination was reported and demonstrate
that a significant part of the observed correlations in linguistic features can indeed be
explained by length coordination rather than stylistic accommodation. In particular, most
stylistic features that exhibit statistically significant correlation exhibit little to no
correlation after length coordination has been taken into account.

We also focus on the observed length correlations, and examine whether they are due to
turn-by-turn coordination between the participants, or can be attributed to other contextual
factors. We construct a statistical permutation test and demonstrate that turn-by-turn length
coordination in dialogues indeed takes place. Finally, we develop a similar test for stylistic
features, and demonstrate that at least for one of the datasets, the remnant coordination
(after conditioning on length) cannot be explained by contextual factors alone and has to be
due to turn-by-turn level coordination between the speakers.

Measuring  Stylistic  Coordination 

Representing  Stylistic  Features 

To represent stylistic features in utterances, we use Linguistic Inquiry and Word Count (LIWC)
[8], which is a dictionary-based encoding scheme that has been used extensively for evaluating
emotional and psychological dimensions in various text corpora. The latest version of the
LIWC dictionary contains around 4500 words and word stems. Each word or word stem
belongs to one or more word categories or subcategories. LIWC categories include
positive and negative emotion, function words, pronouns, articles, and so on. Here we focus on
eight LIWC categories that have been used in previous studies [5]: articles, auxiliary verbs,
conjunctions, high-frequency adverbs, impersonal pronouns, personal pronouns, prepositions,
and quantifiers. Utterances are represented as eight-component binary vectors indicating the
presence or absence of each linguistic marker [5].
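
The binary encoding described above can be illustrated with a minimal sketch. The word lists below are hypothetical stand-ins invented for this example; the actual LIWC dictionary is much larger, also matches word stems, and is not reproduced here. The function name `feature_vector` is likewise an assumption.

```python
import re

# Hypothetical miniature word lists standing in for the eight LIWC categories.
# The real LIWC dictionary is far larger and also matches word stems.
TOY_CATEGORIES = {
    "article": {"a", "an", "the"},
    "auxiliary_verb": {"is", "are", "was", "will", "have", "do"},
    "conjunction": {"and", "but", "or", "because"},
    "adverb": {"very", "really", "just", "so"},
    "impersonal_pronoun": {"it", "this", "that", "something"},
    "personal_pronoun": {"i", "you", "we", "they", "he", "she"},
    "preposition": {"in", "on", "of", "to", "with", "at"},
    "quantifier": {"all", "some", "many", "few", "much"},
}


def feature_vector(utterance):
    """Map an utterance to {category: 1 if any category word occurs, else 0}."""
    tokens = set(re.findall(r"[a-z']+", utterance.lower()))
    return {name: int(bool(tokens & words)) for name, words in TOY_CATEGORIES.items()}


vec = feature_vector("The court will hear it.")
```

Here `vec` marks the article, auxiliary verb and impersonal pronoun categories as present and the remaining five as absent.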

Information-theoretic  measure  of  coordination 

Each dialogue is a sequence of utterance exchanges between two participants. Following [4, 5]
we binarize the stylistic features of utterances, so that a dialogue is represented as
$\{o_k^m, r_k^m\}_{k=1}^{K}$, where $o_k^m, r_k^m \in \{0, 1\}$ indicate the absence or presence
of the stylistic marker m, and K is the total number of exchanges in a dialogue. Since we focus
on coordination between the same stylistic markers, we will drop the superscript m from now on.
We use O to represent the originator (the person who produces the original utterance in a
single exchange) and R to represent the respondent (the person who replies to the originator).

Let p(o, r) be the joint distribution of the random variables O and R. We characterize the
amount of stylistic coordination using Mutual Information (MI) [9]; see S1 File for a brief
overview of basic information-theoretic concepts:

$$I(O : R) = H(O) - H(O|R) \qquad (1)$$

where $H(O) = -\sum_{o} p(o) \log p(o)$ is the Shannon entropy of O, and $H(O|R)$ is the entropy
of O conditioned on R. Note that in our case the arguments are temporally ordered: O is always
the initial utterance, and R is the response, so that Eq 1 in fact defines time-shifted mutual
information. Thus, even though MI is symmetric with respect to its arguments, the coordination
between two users may be asymmetric.

Recall that mutual information between two variables measures the average reduction in the
uncertainty of one variable when the other variable is known. Thus, in essence, the proposed
measure of stylistic coordination quantifies how much O’s use of a marker m in an utterance
helps to predict R’s usage of m in the immediate response. In contrast to linear correlation
measures, mutual information is well suited for handling strongly non-linear dependencies.
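
As a concrete illustration of Eq 1, the following minimal sketch computes the naive plug-in estimate of mutual information from binary (o, r) pairs. The function name, the toy samples, and the choice of base-2 logarithms (bits) are assumptions made for this example and are not taken from the text.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Plug-in estimate of I(O : R) in bits from a list of (o, r) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    count_o = Counter(o for o, _ in pairs)
    count_r = Counter(r for _, r in pairs)
    # Sum over observed cells of p(o, r) * log2( p(o, r) / (p(o) p(r)) ).
    return sum(
        (c / n) * log2((c / n) / ((count_o[o] / n) * (count_r[r] / n)))
        for (o, r), c in joint.items()
    )


# Toy samples: the respondent always mirrors the originator's marker usage ...
mirrored = [(0, 0), (1, 1)] * 50
# ... versus a respondent whose usage is unrelated to the originator's.
unrelated = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25

mi_mirrored = mutual_information(mirrored)
mi_unrelated = mutual_information(unrelated)
```

In the mirrored sample, knowing O's marker removes all uncertainty about R's marker, so the estimate equals the full entropy of a fair binary variable; in the unrelated sample it vanishes.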

We measure the correlation between two variables after conditioning on a third variable, Z,
via Conditional Mutual Information (CMI), defined as

$$I(O : R | Z) = H(O|Z) - H(O|R, Z) \qquad (2)$$

Below we will use CMI to account for the confounding effect of the utterance length by
conditioning on it. Namely, the actual stylistic accommodation, after accounting for the length
coordination, is given by $I(O : R | L_R)$, where $L_R$ is the length of the utterance by user R.
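
To see how conditioning on length can remove an apparent coordination signal, consider the following minimal sketch. It computes a plug-in CMI by averaging the within-stratum mutual information over the values of the conditioning variable. The toy data are an assumption constructed for illustration: the marker appears exactly when an utterance is long, and the respondent's length copies the originator's.

```python
from collections import Counter, defaultdict
from math import log2


def mutual_information(pairs):
    """Plug-in estimate of I(O : R) in bits from a list of (o, r) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    count_o = Counter(o for o, _ in pairs)
    count_r = Counter(r for _, r in pairs)
    return sum(
        (c / n) * log2((c / n) / ((count_o[o] / n) * (count_r[r] / n)))
        for (o, r), c in joint.items()
    )


def conditional_mutual_information(triples):
    """Plug-in I(O : R | Z) from (o, r, z) triples: p(z)-weighted MI within each z."""
    strata = defaultdict(list)
    for o, r, z in triples:
        strata[z].append((o, r))
    n = len(triples)
    return sum(len(p) / n * mutual_information(p) for p in strata.values())


# Toy data (hypothetical): marker present iff the utterance is long, and the
# response length class z always equals the originator's length class.
triples = [(0, 0, "short")] * 50 + [(1, 1, "long")] * 50

mi = mutual_information([(o, r) for o, r, _ in triples])
cmi = conditional_mutual_information(triples)
```

The unconditioned estimate suggests perfect stylistic coordination, while the conditional estimate is zero because, within each length class, the marker carries no further information.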

Estimating  mutual  information  from  data 

Given a set of samples $\{o_k, r_k\}_{k=1}^{K}$, our goal is to estimate mutual information
between O and R. We could do this by first calculating the empirical distribution p(o, r) and
then using Eq 1.
However,  it  is  known  that  this  naive  plug-in  estimator  tends  to  underestimate  the  entropy  of  a 
system.  Instead,  here  we  use  the  statistical  bootstrap  method  introduced  by  DeDeo  et  al.  [10], 
which  attempts  to  reduce  the  bias  of  the  naive  estimator  by  estimating  a  bootstrap  correction 
term.  The  estimate  of  bias  comes  from  comparing  the  entropy  of  the  empirical  distribution  to 
estimates  of  entropy  from  several  bootstrap  datasets  drawn  randomly  according  to  the  empiri¬ 
cal  distribution.  See  [10]  for  more  details. 
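
The general idea of a bootstrap bias correction can be sketched as follows. This is a generic textbook-style correction written for illustration, not the exact procedure of [10]; the function names, the number of bootstrap replicates, and the toy sample are all assumptions.

```python
import random
from collections import Counter
from math import log2


def plugin_entropy(samples):
    """Naive plug-in Shannon entropy (bits) of a list of discrete samples."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def bootstrap_corrected_entropy(samples, n_boot=200, seed=0):
    """Plug-in entropy minus a bootstrap estimate of its bias.

    The bias is estimated as (mean entropy of resampled datasets) minus
    (entropy of the empirical distribution the resamples were drawn from).
    """
    rng = random.Random(seed)
    n = len(samples)
    h_emp = plugin_entropy(samples)
    h_boot = [plugin_entropy(rng.choices(samples, k=n)) for _ in range(n_boot)]
    bias = sum(h_boot) / n_boot - h_emp  # typically negative for small samples
    return h_emp - bias


samples = [0] * 12 + [1] * 8  # small hypothetical sample of a binary marker
h_naive = plugin_entropy(samples)
h_corrected = bootstrap_corrected_entropy(samples)
```

Because resampled datasets tend to have lower plug-in entropy than the distribution they were drawn from, the estimated bias is typically negative and the corrected value lies above the naive one.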

While the above discrete estimator works well for evaluating mutual information between
discrete stylistic variables, it is not very useful for evaluating mutual information between two
length variables, due to the limited number of samples available. Instead, we use the continuous
estimator introduced by Kraskov et al. [11]. This non-parametric estimator finds the k nearest
neighbors of each point, and then averages the mutual information estimated from the
neighborhood of each point. It has been shown that this estimator is asymptotically unbiased
and consistent. A discussion of different entropy estimators can be found in [12] and references
therein.
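
A brute-force, pure-Python sketch of a k-nearest-neighbor mutual information estimator in the spirit of Kraskov et al. [11] is given below. It follows the commonly cited first variant of their estimator; it is an illustration only, not the implementation used for the results here. The function names, the default k, and the synthetic Gaussian data are assumptions. Tied values (common for integer-valued lengths) are not treated specially in this sketch.

```python
import random
from math import log


def digamma(x):
    """Digamma function for x > 0 via recurrence plus an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + log(x) - 0.5 * inv - series


def ksg_mutual_information(xs, ys, k=3):
    """k-nearest-neighbor MI estimate (nats) for paired samples, O(n^2) time."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        # Max-norm distance from point i to every other point in the joint space.
        dists = sorted(
            max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j])) for j in range(n) if j != i
        )
        eps = dists[k - 1]  # distance to the k-th nearest neighbor
        # Count neighbors strictly closer than eps in each marginal space.
        n_x = sum(1 for j in range(n) if j != i and abs(xs[i] - xs[j]) < eps)
        n_y = sum(1 for j in range(n) if j != i and abs(ys[i] - ys[j]) < eps)
        total += digamma(n_x + 1) + digamma(n_y + 1)
    return digamma(k) + digamma(n) - total / n


rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y_dependent = [v + 0.1 * rng.gauss(0, 1) for v in x]
y_independent = [rng.gauss(0, 1) for _ in range(200)]

mi_dependent = ksg_mutual_information(x, y_dependent)
mi_independent = ksg_mutual_information(x, y_independent)
```

On this synthetic data the estimate is large for the strongly dependent pair and close to zero for the independent pair.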

Length  As  A  Confounding  Factor 

We applied our coordination measures to two datasets previously studied in [5]: oral
transcripts from the Supreme Court hearings, and discussions among Wikipedia editors. In the
Supreme Court data, there are 11 Judges and 311 Lawyers conversing with each other. We
obtain 51,498 utterances from all the dialogues among 204 cases. In the Wikipedia dataset,
users are classified into two categories: Administrators (Admins) and non-Admins. All of the
users interact with each other on Wikipedia talk pages, where they discuss issues about specific
Wikipedia pages. We focus on dialogues where each participant makes at least two exchanges
within a dialogue, which results in over 30,000 utterances.

Ideally, we would like to calculate linguistic accommodation between any pair of individuals
O and R who have participated in a dialogue. Unfortunately, most pair-wise exchanges are
rather short and do not produce sufficient samples for evaluating mutual information or
conditional mutual information. Instead, we group the individuals according to their roles, and
then use aggregated samples to calculate stylistic coordination between the groups. The groups
correspond to Judges and Lawyers for the Supreme Court data, and Admins and non-Admins
for the Wikipedia data.

Fig 1. Coordination measures for the Supreme Court data. The red (blue) dots give the true CMI (MI). The green dots represent CMI under the null
hypothesis that there is no coordination after conditioning. (a) Lawyers coordinating to Judges. (b) Judges coordinating to Lawyers. In both panels, the
conditional mutual information is significantly smaller than the mutual information for all eight stylistic features, indicating length is a confounding factor.

doi:10.1371/journal.pone.0130167.g001

Fig 1 describes stylistic coordination for the Supreme Court data as measured by $I(O : R)$
and $I(O : R | L_R)$. The biases of the estimators for conditional mutual information and mutual
information are generally different. Therefore, rather than estimating mutual information
directly, we use a conditional mutual information estimator in which we condition on randomly
permuted values of $L_R$. We repeat this procedure 400 times to produce 99% confidence
intervals for $I(O : R)$ (blue bars). The green bars give the 99% confidence intervals for the
case of no stylistic coordination, obtained by estimating CMI with R’s utterances permuted
(which erases any stylistic coordination).

The blue dots show the mutual information between the corresponding stylistic features,
and suggest strong linguistic correlations between the groups. This effect, however, is strongly
diminished after conditioning on the length of utterances (red dots). For instance, the
coordination scores on features Impersonal Pronoun, Article, and Auxiliary Verb are reduced by
factors of approximately 6.7, 4.8, and 5.3, respectively, after conditioning on length. For the
feature Conjunction, the 99% confidence interval of the coordination score is above the confidence
interval of zero information before conditioning, and falls into the confidence interval of zero
information after conditioning. Similarly, in Fig 1(b), the coordination scores for five out of
eight markers (Impersonal Pronoun, Article, Adverb, Preposition, Quantifier) become practically
zero after conditioning, suggesting that the observed coordination in those stylistic features is
due to length correlations.

A similar picture holds for the Wikipedia dataset shown in Fig 2. Again, we observe non-zero
mutual information in all the features. However, this correlation is significantly diminished
after conditioning on length. In fact, both non-admins coordinating to admins (Fig 2(a))
and admins coordinating to non-admins (Fig 2(b)) have an extremely weak signal after
conditioning on length (all below 0.005). In particular, for non-admins coordinating to admins
(Fig 2(a)), the red dots of five out of eight features lie in the zero conditional mutual
information confidence interval. For these five features in Fig 2(b), we cannot rule out the
null hypothesis that all stylistic coordination is due to the phenomenon of length coordination.

[Figure 2. Panels: (a) Non-admins coordinating to Admins; (b) Admins coordinating to Non-admins. Vertical axis: coordination. Legend: Conditional Mutual Information, Mutual Information, Zero Information.]

Fig 2. Coordination measures for the Wikipedia data. (a) Non-admins coordinating to Admins. (b) Admins coordinating to Non-admins. Symbols have the
same interpretation as in the previous plot.

doi:10.1371/journal.pone.0130167.g002

PLOS ONE | DOI:10.1371/journal.pone.0130167 June 26, 2015

Understanding Confounding Effects in Linguistic Coordination

Another interesting observation is that there is significant asymmetry, or directionality, in
stylistic coordination. For instance, by comparing Fig 1(a) and 1(b) we see that the mutual
information is significantly higher from lawyers to judges than vice versa. A similar (albeit less
pronounced) asymmetry is present for the Wikipedia data as well. This type of asymmetry has
been used to suggest that the relative strength of stylistic accommodation reflects social status
[5]. However, Figs 1 and 2 illustrate that the asymmetry is drastically weakened after
conditioning on length (red dots), suggesting that the phenomenon of higher stylistic coordination
from lawyers to judges (and from non-admins to admins for the Wikipedia dataset) is due to the
confounding effect of length. Unfortunately, a direct assessment of this effect in a single
conversation is not feasible due to the insufficient number of utterances for calculating
conditional mutual information. Nevertheless, in S2 File we suggest a different approach for
addressing the above problem, and find that asymmetry in stylistic coordination can be explained
by asymmetry in length coordination.

To  conclude  this  section,  we  note  that  some  of  the  correlations  in  stylistic  features  persist 
even  after  conditioning  on  length.  One  can  ask  whether  this  remnant  correlation  is  due  to 
turn-by-turn  level  linguistic  coordination,  or  can  be  attributed  to  other  confounding  factors. 
We  address  this  question  in  detail  later  in  the  text. 


Understanding  Length  Coordination 

As discussed in the previous section, the observed correlations in linguistic features can be
attributed to coordination in the length of utterances. Here we analyze this phenomenon in
more detail. In particular, we are interested in whether the observed length correlations are
due to turn-by-turn coordination, or can be attributed to other contextual factors. For instance,
consider a scenario in which Alice and Bob always converse using short statements in one
conversation, while in another conversation they exclusively use long statements, perhaps due to
different topics of conversation. Length coordination is found if data from these two
conversations are aggregated; however, this coordination only reflects Alice’s and Bob’s
response to the topic of conversation. More generally, aggregating data might lead to effects
similar to Simpson’s paradox [13].

Fig 3. A Bayesian network model for length coordination. The network contains contextual factors, C,
the length of an utterance, $L_O$, and the length of the response, $L_R$. (a) The lengths are correlated only due to
contextual factors. (b) The lengths are correlated due to both contextual factors and the potential effect of
turn-by-turn level coordination (represented with the dotted line).

doi:10.1371/journal.pone.0130167.g003
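
The Alice and Bob scenario can be made concrete with a small numerical sketch. The utterance lengths below are hypothetical values chosen for illustration, and the plug-in estimator is used only because the toy sample is tiny and exactly balanced.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Plug-in estimate of mutual information in bits from (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    count_x = Counter(x for x, _ in pairs)
    count_y = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log2((c / n) / ((count_x[x] / n) * (count_y[y] / n)))
        for (x, y), c in joint.items()
    )


# Hypothetical lengths (in words): one conversation uses only short statements,
# the other only long ones. Within each conversation the originator's and the
# respondent's lengths are paired in every combination, i.e. independently.
short_lengths, long_lengths = (5, 6), (50, 51)
conversation_a = [(lo, lr) for lo in short_lengths for lr in short_lengths]
conversation_b = [(lo, lr) for lo in long_lengths for lr in long_lengths]

within_a = mutual_information(conversation_a)
within_b = mutual_information(conversation_b)
aggregated = mutual_information(conversation_a + conversation_b)
```

Within each conversation the lengths carry no information about each other, yet the aggregated sample shows one full bit of apparent length coordination, all of it attributable to the conversation context.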

To understand the possible extent of various confounding factors (we call them contextual
factors), consider the Bayesian network model that incorporates both contextual factors and
length coordination, as shown in Fig 3. Here $L_O$ and $L_R$ are random variables representing the
length of an utterance by the originator O and the respondent R, respectively. In the model
with both solid and dashed lines in Fig 3(b), $L_O$ explicitly influences $L_R$. If instead we only
have the solid lines, as in Fig 3(a), $L_R$ is independent of $L_O$ after conditioning on the context C.
Thus, the model in Fig 3(a) assumes that there is only contextual coordination while Fig 3(b)
implies turn-by-turn coordination. Note that in principle, the contextual factor C can vary
within a single conversation; for example, the theme of a conversation may change as time goes by.
But for simplicity, we will assume that the contextual factor C does not change within the
dialogue or conversation.

Information-theoretic  characterization  of  length  coordination 

A direct measure of Turn-by-turn Length Coordination (TLC) is given by the following
conditional mutual information:

$$\mathrm{TLC} = I(L_O : L_R | C) \qquad (3)$$

Additionally, we define the Overall Length Coordination (OLC) as

$$\mathrm{OLC} = I(L_O : L_R) \qquad (4)$$


Thus, OLC captures not only the length coordination at the turn-by-turn level, but also the
confounding behaviors between $L_O$, $L_R$ and C. In fact, OLC can be decomposed into two terms:

$$\mathrm{OLC} = \mathrm{TLC} + I(L_O : L_R : C) \qquad (5)$$

The second term on the right-hand side of Eq 5 is the multivariate mutual information
(MMI) (also known as interaction information [14] or co-information [15]), and characterizes
the amount of information shared between $L_O$, $L_R$ and C.

A straightforward method to test for turn-by-turn coordination is to evaluate TLC as described
in Eq 3. Indeed, $L_O$ and $L_R$ are conditionally independent given C if and only if TLC = 0.
However, direct evaluation of TLC is not possible due to the lack of a sufficient number of
samples, e.g., the number of exchanges within a specific dialogue. Nevertheless, it is possible
to test for turn-by-turn length coordination with a non-parametric statistical test, as shown
below.


Turn-by-Turn  Length  Coordination  Test 

Our null hypothesis is that there is no turn-by-turn coordination, so that all observed
correlations are due to contextual factors. We now describe a procedure for testing this
hypothesis.

We denote the pairwise set of exchanges in a specific dialogue c between originator o and
respondent r as:

$$S_{o \leftarrow r}^{c} = \{o_k^c, r_k^c\}_{k=1}^{K_c} \qquad (6)$$

where $o_k^c, r_k^c$ indicate the kth exchange (two utterances) by the originator o and respondent
r in dialogue c, and $K_c$ represents the total number of exchanges in c. We also define the
aggregated set of exchanges of user $o \in O$ and user $r \in R$ as:

$$S_{O \leftarrow R} = \bigcup_{o \in O,\, r \in R} \; \bigcup_{c \in C_{o,r}} S_{o \leftarrow r}^{c} \qquad (7)$$

where $C_{o,r}$ represents all the dialogues that involved users o and r. We can rewrite
$S_{O \leftarrow R}$ element-wise as

$$S_{O \leftarrow R} = \{O_k, R_k, C_k\}_{k=1}^{N} \qquad (8)$$

where $N = |S_{O \leftarrow R}|$ is the number of samples. For each triplet on the right-hand side
of Eq 8, $R_k$ is the reply utterance to $O_k$ in the dialogue $C_k$. Finally, from
$S_{O \leftarrow R}$ we obtain the set

$$L(S_{O \leftarrow R}) = \{\mathrm{len}(O_k), \mathrm{len}(R_k)\}_{k=1}^{N} \qquad (9)$$

where len(·) is a function representing the length of an utterance.

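Consider now another sample, which is obtained by randomly permuting the respondent r’s
utterances in the set $S_{o \leftarrow r}^{c}$:

$$\tilde{S}_{o \leftarrow r}^{c} = \{o_k^c, \tilde{r}_k^c\}_{k=1}^{K_c} \qquad (10)$$

where $\{\tilde{r}_k^c\}_{k=1}^{K_c}$ is a random permutation of $\{r_k^c\}_{k=1}^{K_c}$. By
aggregation, we have

$$\tilde{S}_{O \leftarrow R} = \bigcup_{o \in O,\, r \in R} \; \bigcup_{c \in C_{o,r}} \tilde{S}_{o \leftarrow r}^{c} = \{O_k, \tilde{R}_k, C_k\}_{k=1}^{N}$$

and

$$L(\tilde{S}_{O \leftarrow R}) = \{\mathrm{len}(O_k), \mathrm{len}(\tilde{R}_k)\}_{k=1}^{N} \qquad (11)$$
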

Let us assume there is no turn-by-turn coordination, so that $L_O$ and $L_R$ are conditionally
independent of each other given C. Then, it is easy to see that under this null model, the
samples $L(S_{O \leftarrow R})$ and $L(\tilde{S}_{O \leftarrow R})$ have the same likelihood, i.e.,
they are statistically equivalent. In other words, $L(\tilde{S}_{O \leftarrow R})$ can be viewed as
a new sample from the same distribution $p(l_O, l_R)$. This observation suggests the following
test: we first estimate OLC from the sample $L(S_{O \leftarrow R})$ (denoted as $\mathrm{OLC}_0$)
and then from the within-dialogue shuffled sample $L(\tilde{S}_{O \leftarrow R})$ (denoted as
$\mathrm{OLC}_1$). Under the null hypothesis, these two estimates should coincide. Conversely, if
$\mathrm{OLC}_0 \neq \mathrm{OLC}_1$, then the null hypothesis is rejected, suggesting that there is
turn-by-turn length coordination.

Fig 4. Turn-by-turn length coordination test. (a) Supreme Court dataset. (b) Wikipedia dataset. In both subfigures, $\mathrm{OLC}_1$ is significantly smaller than
$\mathrm{OLC}_0$.

doi:10.1371/journal.pone.0130167.g004
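
The within-dialogue shuffling test can be sketched as follows. This is a simplified illustration under stated assumptions: lengths are coded as two classes (0 = short, 1 = long) so that a plug-in estimator can stand in for the continuous estimator of [11]; the Monte Carlo p-value is a common convention added for this example rather than the confidence-interval comparison described in the text; and all function names and the toy dialogues are hypothetical.

```python
import random
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Plug-in estimate of mutual information in bits from (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    count_x = Counter(x for x, _ in pairs)
    count_y = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log2((c / n) / ((count_x[x] / n) * (count_y[y] / n)))
        for (x, y), c in joint.items()
    )


def shuffle_within_dialogues(dialogues, rng):
    """Randomly permute the response lengths inside each dialogue separately."""
    shuffled = {}
    for c, exchanges in dialogues.items():
        responses = [lr for _, lr in exchanges]
        rng.shuffle(responses)
        shuffled[c] = [(lo, lr) for (lo, _), lr in zip(exchanges, responses)]
    return shuffled


def aggregate(dialogues):
    """Pool the (originator length, response length) pairs of all dialogues."""
    return [pair for exchanges in dialogues.values() for pair in exchanges]


def length_coordination_test(dialogues, n_perm=200, seed=0):
    """Return OLC_0, the list of shuffled OLC_1 values, and a Monte Carlo p-value."""
    rng = random.Random(seed)
    olc0 = mutual_information(aggregate(dialogues))
    olc1 = [
        mutual_information(aggregate(shuffle_within_dialogues(dialogues, rng)))
        for _ in range(n_perm)
    ]
    p_value = (1 + sum(v >= olc0 for v in olc1)) / (n_perm + 1)
    return olc0, olc1, p_value


# Toy dialogues: every response copies the length class of the preceding utterance.
dialogues = {c: [(0, 0), (1, 1)] * 3 for c in range(20)}
olc0, olc1, p_value = length_coordination_test(dialogues)
```

Because every toy dialogue contains the same mix of short and long utterances, context alone cannot explain the correlation, and shuffling within dialogues destroys almost all of it.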

The above procedure, which we call the Turn-by-Turn Length Coordination Test, is a
conditional Monte Carlo test [16]. The main advantage of this non-parametric test is that it
requires a smaller sample size and does not require particular distributional assumptions. The
test is non-parametric in two ways: both the permutation procedure and the estimation of mutual
information are non-parametric. We also note that in the context of stylistic coordination, a
similar test was used in Ref. [17].

The results of this test are shown in Fig 4. For the Supreme Court data, Fig 4(a) shows that
both Lawyers to Judges and Judges to Lawyers have non-zero mutual information ($\mathrm{OLC}_0$)
before permutation. The Turn-by-Turn Length Coordination test shows that the mutual
information decreases significantly after permutation (green confidence intervals,
$\mathrm{OLC}_1$), rejecting the null hypothesis that $L_O$ and $L_R$ are independent after
conditioning on the contextual factor C. In other words, the contagion of length exists from the
original utterance to the reply on a turn-by-turn level.

For the results on the Wikipedia discussion board in Fig 4(b), we are also able to reject the
null hypothesis. Notice that the degree of mutual information OLC is higher for Wikipedia
than for the Supreme Court. However, one cannot make a general conclusion about the exact
magnitude of turn-by-turn length coordination (TLC) simply by calculating the loss, i.e.,
$\mathrm{OLC}_0 - \mathrm{OLC}_1$.

Revisiting Stylistic Coordination

We  demonstrated  in  the  previous  section  that  strong  correlations  in  utterance  length  explain 
most  of  the  observed  stylistic  coordination.  However,  in  some  situations,  there  are  statistically 
significant  non-zero  signals  even  after  conditioning  on  length,  e.g.,  the  first  feature  dimension 
Personal  Pronoun  in  Fig  1(a)  and  1(b).  We  now  proceed  to  examine  this  remnant  coordination. 
Specifically,  we  are  interested  in  the  following  question:  Does  the  non-zero  conditional  mutual 
information  (after  conditioning  on  length)  represent  turn-by-turn  level  stylistic  coordination, 
or  is  it  due  to  other  contextual  factors? 

Toward this goal, consider the Bayesian network in Fig 5, which depicts conditional
independence relations between the length variables $L_O$ and $L_R$, the stylistic variables
$F_O^m$ and $F_R^m$ with respect to a style feature m, and the contextual (dialogue) variable C.
The solid arrow from $L_O$ to $L_R$ reflects our findings from the last section about the existence
of turn-by-turn length coordination. The dashed arc between the features $F_O^m$ and $F_R^m$
characterizes turn-by-turn stylistic coordination. Finally, the grey arcs between C, $F_O^m$ and
C, $F_R^m$ indicate possible contextual coordination.


Fig 5. A Bayesian network for linguistic style coordination. $L_O$ and $L_R$ represent the length of the originator's
utterance and the length of the respondent's utterance, respectively. $F_O^m$ and $F_R^m$ represent a specific style
feature variable for the originator and the respondent, respectively.

doi:10.1371/journal.pone.0130167.g005

We use conditional mutual information to measure the Turn-by-turn Stylistic Coordination
(TSC) with respect to a specific style feature m:

$$\mathrm{TSC} = I(F_O^m : F_R^m | C, L_R) \qquad (12)$$

where $F_O^m$, $F_R^m$ are binary variables indicating whether the feature m appears in an
utterance. Also, the Overall Stylistic Coordination (OSC) is defined as

$$\mathrm{OSC} = I(F_O^m : F_R^m | L_R) \qquad (13)$$

Thus, OSC is exactly the conditional mutual information introduced in Eq 2. Note that, even
after conditioning on length, $F_O^m$ and $F_R^m$ are still dependent on each other because they
share the contextual factor C. ($F_O^m \leftarrow C \rightarrow F_R^m$ is called a d-connected path
in [18].)

Again, a direct measure of turn-by-turn stylistic coordination corresponds to non-zero TSC
in Eq 12. However, TSC is hard to evaluate due to the lack of sufficient samples. Furthermore, the
shuffling test from the previous sections is not directly applicable here either, because it needs
to be done in a way that keeps the correlations between $L_O$ and $L_R$ intact. In other words,
one can only exchange utterances that have the same length. Since most dialogues are rather short,
this type of shuffling test is not feasible, and one needs an alternative approach.


Turn-by-Turn  Stylistic  Coordination  Test 

Our proposed test is based on the following idea: if we can rule out the influence of the
contextual factors on stylistic correlations, then any non-zero conditional mutual information can
only be explained by turn-by-turn stylistic coordination, i.e., OSC = TSC. Thus, the null
hypothesis is that there is contextual level coordination in stylistic features. We emphasize that
by contextual coordination, we are actually referring to the links from C to $F_O^m$ and from C to
$F_R^m$ in Fig 5.

We follow the same notation and methodology used in previous sections. By Eq 8, let us
denote the mixed length and stylistic feature set of $S_{O \leftarrow R}$ as:

$$LF^m(S_{O \leftarrow R}) = \{\mathrm{len}(O_k), \mathrm{len}(R_k), f^m(O_k), f^m(R_k), C_k\}_{k=1}^{N}$$

where $f^m(\cdot)$ is a binary function that indicates whether the style feature m appears in an
utterance.

Consider now the shuffling procedure: we randomly permute the respondent’s utterances
within a dialogue and obtain the set $\tilde{S}_{O \leftarrow R}$ in Eq 11. We also define the
length and feature set of $\tilde{S}_{O \leftarrow R}$ as:

$$LF^m(\tilde{S}_{O \leftarrow R}) = \{\mathrm{len}(O_k), \mathrm{len}(\tilde{R}_k), f^m(O_k), f^m(\tilde{R}_k), C_k\}_{k=1}^{N}$$

Clearly, the permutation destroys the turn-by-turn level coordination in both length and style.
Thus, any remnant correlation must be due to contextual coordination, i.e., the fork
$F_O^m \leftarrow C \rightarrow F_R^m$. This provides a straightforward test for the existence of
contextual coordination. Indeed, we simply need to estimate the overall stylistic coordination
$\mathrm{OSC}_1$ using the shuffled sample $LF^m(\tilde{S}_{O \leftarrow R})$. If $\mathrm{OSC}_1$
is larger than zero, then there is necessarily contextual coordination. On the other hand, if
$\mathrm{OSC}_1 = 0$, then all the observed stylistic correlations (calculated using the
original non-shuffled sample) must be due to turn-by-turn stylistic coordination.
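
The stylistic shuffling test can be sketched in the same spirit. The code below is an illustration under stated assumptions: each exchange is stored as a hypothetical tuple (originator length class, response length class, originator feature, response feature); a plug-in CMI estimator is used; and the two toy datasets are invented to contrast purely turn-by-turn coordination with purely contextual coordination.

```python
import random
from collections import Counter, defaultdict
from math import log2


def mutual_information(pairs):
    """Plug-in estimate of mutual information in bits from (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    count_x = Counter(x for x, _ in pairs)
    count_y = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log2((c / n) / ((count_x[x] / n) * (count_y[y] / n)))
        for (x, y), c in joint.items()
    )


def osc(dialogues):
    """Plug-in I(F_O : F_R | L_R) on the sample aggregated over all dialogues."""
    strata = defaultdict(list)
    for rows in dialogues.values():
        for _, len_r, f_o, f_r in rows:
            strata[len_r].append((f_o, f_r))
    n = sum(len(p) for p in strata.values())
    return sum(len(p) / n * mutual_information(p) for p in strata.values())


def shuffle_responses(dialogues, rng):
    """Permute the respondent's utterances (length and feature) within each dialogue."""
    shuffled = {}
    for c, rows in dialogues.items():
        responses = [(len_r, f_r) for _, len_r, _, f_r in rows]
        rng.shuffle(responses)
        shuffled[c] = [
            (len_o, len_r, f_o, f_r)
            for (len_o, _, f_o, _), (len_r, f_r) in zip(rows, responses)
        ]
    return shuffled


rng = random.Random(0)

# Toy case 1: the response copies the marker of the preceding utterance.
turn_by_turn = {c: [(1, 1, 0, 0), (1, 1, 1, 1)] * 3 for c in range(20)}
# Toy case 2: marker usage is fixed by the dialogue (context), not by the turn.
contextual = {c: [(1, 1, c % 2, c % 2)] * 6 for c in range(20)}

osc0_turn, osc1_turn = osc(turn_by_turn), osc(shuffle_responses(turn_by_turn, rng))
osc0_ctx, osc1_ctx = osc(contextual), osc(shuffle_responses(contextual, rng))
```

In the turn-by-turn case the shuffled estimate collapses toward zero, whereas in the contextual case it is unchanged by shuffling, which is the signature of contextual coordination described above.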

Let us first consider the results of the above test for the Supreme Court data. From Fig 6(a)
and 6(b), one can see that for all the features, the corresponding CMI values $\mathrm{OSC}_1$ are
within the zero-information confidence intervals, indicating that non-zero conditional mutual
information (OSC before shuffling) cannot be attributed to contextual factors. In other words, the
remnant correlations that are not explained by length coordination must be due to turn-by-turn
level coordination.

Fig 6. Turn-by-turn stylistic coordination test for Supreme Court data. (a) Lawyers coordinating to Judges. (b) Judges coordinating to Lawyers. (Blue
bars indicate the overall stylistic coordination (OSC) before the test.) One can see that after shuffling, values of $\mathrm{OSC}_1$ are within the zero-information
confidence intervals.

doi:10.1371/journal.pone.0130167.g006

The situation is different for Wikipedia data. Indeed, Fig 7(a) and 7(b) show that for the
stylistic features with statistically significant remnant correlations even after conditioning on
length (OSC), the results of the above permutation tests are rather inconclusive. Namely,
although the confidence intervals of $\widetilde{\mathrm{OSC}}$ (OSC on the shuffled sample) do overlap
with the zero-information confidence intervals, one cannot state unequivocally that they are zero.
In other words, one cannot rule out the null hypothesis that the remnant stylistic coordination is
due to the contextual factors rather than turn-by-turn coordination.


Discussion 

In conclusion, we have suggested an information-theoretic framework for measuring and
analyzing stylistic coordination in dialogues. We first extracted the stylistic features from the
dialogue of two participants and then used Mutual Information (MI) as a theoretically motivated
measure of dependence to characterize the amount of stylistic coordination between the
originator and the respondent in the dialogue. Moreover, by introducing Conditional Mutual
Information (CMI), which allows us to measure the correlation between two variables after
conditioning on a third variable, we are able to more accurately gauge stylistic accommodation
by controlling for confounding effects like length coordination.

We then used the proposed method to revisit some of the previous studies that had reported
strong stylistic coordination. While the suggestion that one person’s use of, e.g., prepositions
will (perhaps unconsciously) lead the other to use more prepositions is fascinating, our results
indicate that previous studies have vastly overstated the extent of stylistic coordination. In
particular, we showed that a significant part of the observed stylistic coordination can be attributed
to the confounding effect of length coordination. We find that for both Supreme Court and
Wikipedia data, the coordination score is greatly diminished after conditioning on length. We
also find that the significant asymmetry in stylistic coordination shown in the previous study
[5] is drastically weakened after conditioning on length. In fact, our results indicate that the
asymmetry in length coordination can explain almost all the observed asymmetry in stylistic
coordination.

Fig 7. Turn-by-turn stylistic coordination test for Wikipedia data. (a) Non-admins coordinating to Admins. (b) Admins coordinating to Non-admins.
Blue bars indicate the overall stylistic coordination (OSC) before the test. One cannot rule out the null hypothesis that the remnant
stylistic coordination is due to the contextual factors.

doi:10.1371/journal.pone.0130167.g007

Simpson’s paradox provides a famous example of how correlations observed in a population
can disappear or even be reversed after conditioning on sub-populations. In an
information-theoretic setting, a similar “paradox” can be seen in the example illustrated by
Fig 3: for $L_O$, $L_R$ and $C$, the mutual information $I(L_O : L_R) > 0$, while
$I(L_O : L_R \mid C) = 0$. If we only look at the aggregated data, averaging over all contexts $C$,
i.e., $I(L_O : L_R)$, there will be artificial mutual information between $L_O$ and $L_R$. Ideally,
we could calculate $I(L_O : L_R \mid C)$ directly; however, there may not be enough samples to
calculate the conditional mutual information for all values of $C$. How can we still determine
whether $I(L_O : L_R \mid C)$ is zero or not while using all the data? We designed non-parametric
statistical tests to solve this problem in general while making full use of the available data.
More importantly, because these information-theoretic quantities directly reflect constraints on
graphical models, the mystery of Simpson-like paradoxes is replaced with concrete alternatives
for generative stories as depicted in Figs 3 and 5.
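
The Simpson-like effect can be checked exactly on a toy fork $L_O \leftarrow C \rightarrow L_R$. The sketch below is only an illustration: the binary context, the binary "long/short" length variables, and the probabilities 0.2 and 0.8 are made-up values, not quantities from the data. It computes both information measures directly from the joint distribution, so no estimation is involved.

```python
import math
from itertools import product

# Hypothetical toy model: context C is 0 or 1 with equal probability, and each
# speaker independently produces a long utterance with a context-dependent probability.
p_c = {0: 0.5, 1: 0.5}
p_long_given_c = {0: 0.2, 1: 0.8}


def bernoulli(p, value):
    """Probability that a Bernoulli(p) variable equals value (0 or 1)."""
    return p if value == 1 else 1.0 - p


# Joint distribution over (L_O, L_R, C) implied by the fork L_O <- C -> L_R.
joint = {
    (lo, lr, c): p_c[c] * bernoulli(p_long_given_c[c], lo) * bernoulli(p_long_given_c[c], lr)
    for lo, lr, c in product((0, 1), repeat=3)
}


def marginal(dist, keep):
    """Sum out every coordinate whose index is not in keep."""
    out = {}
    for key, p in dist.items():
        sub = tuple(key[i] for i in keep)
        out[sub] = out.get(sub, 0.0) + p
    return out


def mutual_information(dist):
    """I(L_O : L_R) in bits, ignoring the context."""
    p_xy, p_x, p_y = marginal(dist, (0, 1)), marginal(dist, (0,)), marginal(dist, (1,))
    return sum(
        p * math.log2(p / (p_x[(x,)] * p_y[(y,)]))
        for (x, y), p in p_xy.items() if p > 0
    )


def conditional_mutual_information(dist):
    """I(L_O : L_R | C) in bits."""
    p_xz, p_yz, p_z = marginal(dist, (0, 2)), marginal(dist, (1, 2)), marginal(dist, (2,))
    return sum(
        p * math.log2(p * p_z[(z,)] / (p_xz[(x, z)] * p_yz[(y, z)]))
        for (x, y, z), p in dist.items() if p > 0
    )


mi = mutual_information(joint)
cmi = conditional_mutual_information(joint)
```

Here `mi` is strictly positive while `cmi` vanishes up to floating-point error: the aggregated dependence between the two lengths is produced entirely by the shared context.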

We also observed that for some of the stylistic markers, there were diminished but still
statistically significant correlations even after conditioning on length. We again designed a
non-parametric statistical test for analyzing this remnant coordination more thoroughly. Our
findings suggest that for the Supreme Court data, the remnant coordination cannot be fully
explained by other contextual factors. Instead, we postulate that the remnant correlations in
the Supreme Court data are due to turn-by-turn level coordination. For the Wikipedia data,
however, our results are less conclusive, and we cannot draw any conclusion about
turn-by-turn stylistic coordination. Thus, caution must be taken when making general claims about the
possible origin of stylistic coordination in different settings.

It is possible to develop alternative tests based on more fine-grained, token-level generative
models. The main idea behind such a test is to shuffle the word tokens uttered by an individual
within each dialogue, which should destroy turn-by-turn coordination. Our preliminary results
based on this test suggest that most of the remnant correlations are indeed due to turn-by-turn
coordination. However, we emphasize that this test requires an additional assumption whose
validity needs to be verified, namely that the words used by a given speaker within a
conversation are independent and identically distributed (i.i.d.). Furthermore, the test assumes
stationarity, i.e., that the contextual factors do not vary within the course of the dialogue. While this
assumption seems reasonable in the dialogue settings considered here, it is important to note
that deviations from stationarity might be yet another serious obstacle for identifying stylistic
influences [19, 20]. Indeed, if we relax the stationarity condition, then any observed correlation
in stylistic features might be due to temporal evolution rather than direct influence. And since
any permutation-based test destroys temporal ordering, it cannot differentiate between those
two possibilities.
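
A token-level shuffle of this kind might look as follows. This is a rough sketch under assumptions not stated in the text: utterances are assumed to be lists of word tokens, and the tokens are redistributed so that each utterance keeps its original length (so that length coordination is left untouched). The function name is hypothetical.

```python
import random


def shuffle_speaker_tokens(utterances, rng):
    """Shuffle one speaker's tokens across that speaker's utterances in one dialogue.

    Each output utterance keeps the length of the corresponding input utterance
    (an assumption of this sketch); the input list is not modified.
    """
    pooled = [token for utterance in utterances for token in utterance]
    rng.shuffle(pooled)
    shuffled, start = [], 0
    for utterance in utterances:
        shuffled.append(pooled[start:start + len(utterance)])
        start += len(utterance)
    return shuffled


# Example: three utterances by the same speaker within one dialogue.
speaker_turns = [["i", "agree", "with", "that"], ["no"], ["of", "course", "not"]]
shuffled_turns = shuffle_speaker_tokens(speaker_turns, random.Random(0))
```

Under the i.i.d. assumption discussed above, the shuffled utterances have the same per-dialogue word distribution as the originals, while any dependence of a particular reply on the utterance it answers is removed.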

While our work focuses on linguistic style matching, we believe that the information-theoretic
method proposed can be useful for studying more general types of linguistic coordination in
dialogues, such as structural priming [21, 22], or lexical entrainment [23, 24]. Recall that according
to the structural priming hypothesis, the presence of a certain linguistic structure in an utterance
affects the probability of seeing the same structure later in the dialogue. This type of turn-by-turn
coordination can naturally be captured by (time-shifted) mutual information between properly
defined linguistic variables. Furthermore, using the permutation tests described here, it should be
possible to differentiate between historical and ahistorical mechanisms of lexical entrainment
[24]. Indeed, the former mechanism assumes some type of influence/coordination between the
speakers that helps them to arrive at a common conceptualization. The ahistorical mechanism,
on the other hand, assumes that the speaker’s choice of each term is an independent event
affected by the informativeness and availability of the term, and some other factors, which, in our
terminology, is analogous to contextual coordination.
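
One possible reading of "time-shifted mutual information" is $I(X_t : Y_{t+\tau})$, where $X_t$ marks whether a structure appears in the originator's turn $t$ and $Y_{t+\tau}$ marks whether it appears in the respondent's turn $\tau$ steps later. The sketch below assumes this definition, binary per-turn indicator sequences, and a plug-in estimator; the names `time_shifted_mi` and `lag` are illustrative.

```python
import math
from collections import Counter


def plugin_mi(pairs):
    """Plug-in estimate of I(X : Y) in bits from (x, y) samples."""
    n = len(pairs)
    n_xy = Counter(pairs)
    n_x = Counter(x for x, _ in pairs)
    n_y = Counter(y for _, y in pairs)
    return sum(
        (c / n) * math.log2(c * n / (n_x[x] * n_y[y]))
        for (x, y), c in n_xy.items()
    )


def time_shifted_mi(x, y, lag):
    """Estimate I(X_t : Y_{t+lag}) by pairing x[t] with y[t + lag]."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    pairs = list(zip(x, y[lag:]))  # zip truncates to the shorter sequence
    if not pairs:
        raise ValueError("lag is too large for the given sequences")
    return plugin_mi(pairs)


# Toy sequences: the respondent repeats the originator's structure one turn later.
originator = [0, 1, 1, 0, 0, 1, 1, 0, 1]
respondent = [0] + originator[:-1]
mi_by_lag = {lag: time_shifted_mi(originator, respondent, lag) for lag in range(3)}
```

Scanning over `lag` shows at which delay the dependence is concentrated; as with the other tests, the estimates would still need to be compared with a shuffled baseline before being interpreted as priming.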

In a broader context, we note that sociolinguistic analysis has been used for assessing and
predicting societally important outcomes such as health behaviors, suicidal intent, and
emotional well-being, to name a few examples [25-29]. Thus, it is imperative that such predictions
are based on sound theoretical and methodological principles. Here we suggest that
information theory provides a powerful computational framework for testing various hypotheses, and
furthermore, is flexible enough to account for various confounding variables. Recent advances
in information-theoretic estimation are shifting these approaches from the theoretical realm
into practical and useful techniques for data analysis. We hope that this work will contribute to
the development of mathematically principled tools that enable computational social scientists
to draw meaningful conclusions from sociolinguistic phenomena.